[core] Fix silent data corruption due to overflow in FieldProductAgg by ArnavBalyan · Pull Request #7906 · apache/paimon

ArnavBalyan · 2026-05-19T13:11:08Z

Purpose

FieldProductAgg multiplies integers without overflow detection.
Accumulator silently wraps corrupting downstream product aggregations.
Throw on overflow using Math multiplyExact and range checks.
Previously the data corruption was silently ignored. However with this change, an overflow throws a hard exception. Downstream consumers should be aware.

Tests

UT

ArnavBalyan · 2026-05-19T13:11:34Z

cc @JingsongLi thanks! :)

ArnavBalyan · 2026-05-21T05:12:32Z

cc @leaves12138 @JingsongLi can you pls review thanks!

JingsongLi

Review of FieldProductAgg overflow detection

The intent is good — silent integer overflow during aggregation is indeed a correctness bug that corrupts data. The agg() fix for INTEGER/BIGINT using Math.multiplyExact is clean. The toExactByte/toExactShort helpers are correct since Java promotes byte * byte and short * short to int before the multiplication, so the int result can be range-checked without itself overflowing.

Issues

1. retract() method has the same overflow bug but is not fixed

paimon-core/src/main/java/org/apache/paimon/mergetree/compact/aggregate/FieldProductAgg.java

The retract method performs division, which can also overflow silently in the same cases:

(byte) ((byte) Byte.MIN_VALUE / (byte) -1) → result is 128, which wraps to -128 when cast to byte
(short) ((short) Short.MIN_VALUE / (short) -1) → result is 32768, wraps to -32768
(int) Integer.MIN_VALUE / (int) -1 → wraps to Integer.MIN_VALUE in Java
(long) Long.MIN_VALUE / (long) -1 → wraps to Long.MIN_VALUE in Java

If the goal of this PR is to eliminate silent data corruption from overflow, the retract() path should get the same treatment for consistency. At minimum, the TINYINT and SMALLINT cases need the same range-check pattern since the int promotion makes the overflow detectable. For INTEGER and BIGINT, you would need explicit checks (if (accumulator == Integer.MIN_VALUE && inputField == -1) etc.) since Java's / operator does not throw.

2. Exception message could include operand values for debuggability

When a compaction task fails with ArithmeticException("byte overflow"), the operator has no idea which values caused it. Consider including the operands:

throw new ArithmeticException(
    String.format("byte overflow: %d * %d = %d", (byte) accumulator, (byte) inputField, value));

This is especially important because this exception will surface during compaction, potentially long after the data was written, making reproduction difficult.

3. Consider the failure mode during compaction

An unchecked ArithmeticException thrown during compaction will fail the compaction task. Depending on the retry/error-handling policy, this could block compaction indefinitely for a partition that has overflowing values. This is arguably better than silent corruption, but it would be worth documenting this behavioral change (e.g., in the PR description or release notes) so users understand that previously-silent overflows will now cause task failures.

Minor notes

The test cases are well chosen (boundary values, both positive and negative overflow). Good coverage.
FLOAT/DOUBLE overflow to infinity is not addressed, which is fine — that is standard IEEE 754 semantics and expected for floating-point types.
FieldSumAgg has the same class of overflow bugs, but that is out of scope for this PR.

Overall this is a useful correctness fix. The main request is to apply the same protection to retract() for completeness, since a half-fix leaves an inconsistency where agg() detects overflow but retract() on the same type silently corrupts.

ArnavBalyan · 2026-05-23T15:03:34Z

Thanks for the review! Have addressed all comments:

retract() method has the same overflow bug but is not fixed
- Covered all cases, division and multiplication are now guarded and will not silently corrupt data.
Exception message could include operand values for debuggability
- Made the exception more clear and descriptive.
Consider the failure mode during compaction
- Added more documentation to this PR for posterity.

JingsongLi

Review of FieldProductAgg overflow detection

Overall: The PR addresses a real correctness issue — integer multiplication/division overflow silently wrapping and corrupting aggregation results. The approach is sound and the fix is complete across both agg() and retract() paths.

What's good

Correctness: The overflow detection logic is correct for all four integer types. byte/short use int-promoted arithmetic with range checks (safe because int can hold the full product/quotient range). int/long use Math.multiplyExact for multiply and explicit MIN_VALUE / -1 guards for divide.
Error messages: Include operand values, which is critical for debugging compaction failures.
Test coverage: Tests cover multiply overflow on all 4 integer types + retract (division) overflow on all 4 types, with both positive and negative overflow cases.
Float/Double left alone: Correct — IEEE 754 overflow to infinity is expected behavior.

Minor suggestions

1. multiplyExactInt/multiplyExactLong — catch-rethrow pattern loses the original exception cause

} catch (ArithmeticException e) {
    throw new ArithmeticException(String.format("int overflow: %d * %d", a, b));
}

The new exception doesn't chain e as a cause. ArithmeticException lacks a cause constructor, but initCause() works:

ArithmeticException ex = new ArithmeticException(String.format("int overflow: %d * %d", a, b));
ex.initCause(e);
throw ex;

2. divideExactByte/divideExactShort — consider aligning with int/long pattern for consistency

The only case where integer division can overflow is MIN_VALUE / -1. The general value > MAX || value < MIN range check works correctly, but you could match the explicit MIN_VALUE && -1 check pattern used by divideExactInt/divideExactLong for consistency across all four types.

3. Behavioral change during compaction

An ArithmeticException during compaction will crash the task. This is better than silent corruption, but for existing tables that already contain overflowing data, upgrading Paimon could cause compaction to fail on those partitions with no easy recovery path. Worth noting in release notes.

4. FieldSumAgg has the same class of bug

FieldSumAgg also does unchecked (byte) accumulator + (byte) inputField etc. with the same silent overflow risk. Out of scope for this PR, but worth opening a follow-up issue.

JingsongLi reviewed May 23, 2026

View reviewed changes

upadte

0885945

ArnavBalyan force-pushed the arnavb/fix-product-overflow branch from 53057d1 to 0885945 Compare May 23, 2026 15:00

JingsongLi reviewed May 25, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[core] Fix silent data corruption due to overflow in FieldProductAgg#7906

[core] Fix silent data corruption due to overflow in FieldProductAgg#7906
ArnavBalyan wants to merge 1 commit into
apache:masterfrom
ArnavBalyan:arnavb/fix-product-overflow

ArnavBalyan commented May 19, 2026 •

edited

Loading

Uh oh!

ArnavBalyan commented May 19, 2026

Uh oh!

ArnavBalyan commented May 21, 2026

Uh oh!

JingsongLi left a comment

Uh oh!

ArnavBalyan commented May 23, 2026 •

edited

Loading

Uh oh!

JingsongLi left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ArnavBalyan commented May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Purpose

Tests

Uh oh!

ArnavBalyan commented May 19, 2026

Uh oh!

ArnavBalyan commented May 21, 2026

Uh oh!

JingsongLi left a comment

Choose a reason for hiding this comment

Review of FieldProductAgg overflow detection

Issues

Minor notes

Uh oh!

ArnavBalyan commented May 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

JingsongLi left a comment

Choose a reason for hiding this comment

Review of FieldProductAgg overflow detection

What's good

Minor suggestions

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ArnavBalyan commented May 19, 2026 •

edited

Loading

ArnavBalyan commented May 23, 2026 •

edited

Loading